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Abstract 

There have been many matching pursuit algorithms (MPAs) which handle the sparse signal recovery 
problem a.k.a. compressed sensing (CS). In the MPAs, the correlation computation step has a dominant 
computational complexity. In this letter, we propose a new fast correlation computation method when we 
use some classes of partial unitary matrices as the sensing matrix. Those partial unitary matrices include 
partial Fourier matrices and partial Hadamard matrices which are popular sensing matrices. The proposed 
correlation computation method can be applied to almost all MPAs without causing any degradation of 
their recovery performance. And, for most practical parameters, the proposed method can reduce the 
computational complexity of the MPAs substantially. 

Index Terms 

compressed sensing (CS), fast correlation computation, Fourier matrix, Hadamard matrix, matching 
pursuit algorithm (MPA). 

I. Introduction 

Compressed sensing (CS) is a novel sampling technique, where one can recover sparse signals from 
the undersampled measurements HI. In a typical CS problem, the goal is to exactly reconstruct the N x 1 
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K-sparse signal vector x based on the M X 1 measurement vector y. By K-sparse we mean that there 
are at most K nonzero elements in x. The vectors x and y are linearly related to each other as 

y = <f>x + rj, (1) 

where <3? is the M x N sensing matrix and rj is the M X 1 noise vector. And the relation of if, M, and 
AT is generally K < M < iV. 

For the sensing matrix <I>, partial Fourier matrices and partial Hadamard matrices are popular sensing 
matrices, where we mean that the partial matrix is constructed by some M rows of N x N original 
matrix A. In other words, $ = SqA, where Sq is the M x N row selection matrix consisting of M 
rows (indices from some index set Q) of N x N identity matrix I. 

Firstly, the partial Fourier matrix is frequently used because of its good recovery performance, fast 
implementation using the fast Fourier transform (FFT), and applicability to practical signals. The examples 
include channel estimation in communication systems [2] and magnetic resonance imaging (MRI) [3]. For 
the partial Fourier matrix, the index set f2 can be constructed randomly or based on the cyclic difference 
set S. 

Secondly, some recent researches showed that well-designed deterministic sensing matrices based on 
linear block codes have better performance and less complexity for signal recovery compared to random 
sensing matrices Q, 10. It is well known that a sensing matrix whose columns are bipolar-presented 
codewords of a binary linear block code can be viewed as a partial Hadamard matrix. And we can exploit 
the efficiency of the fast Hadamard transform (FHT). 

To recover x in ([I]), matching pursuit algorithms (MPAs) find a sparse estimation of the signal x from y 
in a greedy fashion. It works iteratively by choosing the component that has the highest correlation with the 
current residual. Examples include the orthogonal matching pursuit (OMP) [7] and its modified versions 
such as the compressive sampling matching pursuit (CoSaMP) [8], the regularized OMP (ROMP) [9], 
the subspace pursuit (SP) [ 101, and the backtracking-based matching pursuit (BB MP) [11]. For instance, 
we summarize the OMP which is the most basic algorithm among the MPAs. The steps marked by 
are the common steps to the MPAs. 



Algorithm 1.1 Conventional OMP recovery algorithm 



1) Initialize : ro = y, Aq = 0, t = 1. 

2) Correlation computation : ht-i = $ rt-x- 
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3) Identification : X t = arg maxj =1 N \ht-i(j)\. 

4) Augment the index set : A t = A t _i U {Xt}. 

5) Construct $ t : $ t = ^S 1 ^. 

6) Least squares : x t = (<J>^<I> t ) _1 <J>£V 

7) Update current residual : at = &t%t, n = y — a t . 

8) t = t + 1, return to 2) if the halting criterion is not triggered. 

In Algorithm 1.1, performing ht-\ = <& H r t -i in 2) can be viewed as computing the correlations 
between the current residual r t -\ and the columns of <£. And we denote h t -\ as the correlation vector 
at the £-th iteration. The whole computational complexity of the OMP is dominated by the correlation 
computation step and so are the other MPAs'. 

In this letter, we propose a new fast correlation computation method which can be applied to almost all 
MPAs including OMP, CoSaMP, ROMP, SP, and BB MP. The recovery performances of the MPAs applied 
by the proposed method are exactly the same as those of the original MPAs. And, for most practical 
parameters, the proposed method can reduce the computational complexity of the MPAs substantially. 
The proposed method can operate only when the sensing matrix is the partial unitary matrix satisfying 
the following two constraints : 

1) Every element of the unitary matrix U has the magnitude 1/y/N. 

2) The set {y/Nu\, VN112, ■ ■ ■ ,VNun}, where u n is the n-th column of U, is closed under element- 
wise multiplication o. 

At a glance, the above constraints seem to be too strict, however, the Fourier matrix and the Hadamard 
matrix are two kinds of the unitary matrices with these constraints. Therefore, the proposed method is 
meaningful and it can be widely adopted in CS. 

II. A New Fast Correlation Computation Method for MPAs 

In this section, we describe the proposed fast correlation computation method for general MPAs. 
The MPAs have the common steps marked by in Algorithm 1.1 and we derive the fast correlation 
computation method based on only those steps. In the following derivation, U is the unitary matrix 
satisfying the two constraints and the sensing matrix is $ = S^U . For simplicity, we handle not the i-th 
iteration but the (t + l)-th iteration. 
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Using the steps 5) and 7) in Algorithm 1.1, the correlation computation step 2) at the (t + l)-th iteration 
ht = & H r t can be rewritten as 

h t =U H Slr t 

=U H Sl{y-a t ) 
=U H Sly - U H Sla t 



ho - U H S£S n USlx t , (2) 



where h = & H r = U H Sj l y. 

In (|2j), S^xt can be represented as 

|A t | 

S L X t = ^2 X t( T ) e A t (r), (3) 

T=l 

where e^Ar) * s trie Af(r)-th column of I, \K t \ is the cardinality of the index set At, and xt(r) is the 
r-th element of xt- By using Q and ([3]), we obtain 

|A t | 

ht =h - U H SnSnU^2x t (T)e At ( T) 

T=l 

|At| 

=^0 - £ xt(r)U H SESnUe At{T) . (4) 

T=l 



Without loss of generality, we assume the first column of U is (1/VJV, 1/ViV, • • • , WN) T . And (|4 
can be rewritten as 

|A t | 

h t = h -Y,xt(T)U H S£SnD Mr) U ei , (5) 

T=l 

where D A j T \ = y/N ■ diag(«yv t (r)) an d u A t (r) i s the A t (r)-th column of U. Because the matrix S^Sq 
is the diagonal matrix, SqSciD^^ = D^^SqSci and ^ can be rewritten as 

I At | 

h t = ho-^x t (r)U H D At{T) sESnUe 1 . (6) 

T=l 

We denote P\j T ) = U H D At ( T ^U and thus P At ^U H = XJ D A t T \. Consequently, the correlation 
computation at the (t + l)-th iteration can be expressed as 

|A t | 

h t =h - £ x t (T)P AtiT) U H SESnU ei 

T = l 

I A«| 

=ho ~ ^2 x t (T)P At{T) c, (7) 

r=l 
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where c = U H S^SnUei which is called the correlation kernel vector. Note that the correlation kernel 
vector c is independent to the sparse signal vector x and thus can be stored in advance. The matrix 
P At ( r ) in (jTJ) is a permutation matrix according to the following theorem. The permutation matrix can be 
performed with negligible computational complexity because of its structure. 

Theorem 2-1 : PaJt) = U H D A j T \U is a permutation matrix (i.e., a square binary matrix that has 
exactly one element 1 in each row and each column and Os elsewhere) if the unitary matrix U is under 
the two constraints. 

Proof of Theorem 2-1 : U H D At(r) U can be expressed as 

U H D At{j) U = (m ••• u N ) H ■ VN • diag(u At(r) ) • (m ••• u N ) 

-=-(u! ••• u N ) H ■ (VNu At ( T ) o \/Nui ••• VNu At ( T ) o VNu N ^J . (8) 



/ iVitA t ( T ) ° VNu n , n = 1, • • • , N, are distinct column vectors because their elements are nonzero by 
the first constraint of U. And each vector belongs to the set {y/Nui, VNu2, ■ ■ ■ ,\^Nun} because of 
the second constraint of U. Therefore, ([8]) can be rewritten as 

U H D At(r) U =4=-(«l ••• u N ) H -(yNui ••• VNu N ~) • P At(r) 



= (ui ••• u N ) H ■ (ui ••• un) ■ P\ t (T) 

=IP Mt) =P Mt) , (9) 

where P\j T ) i s trie permutation matrix which is determined by A 4 (r) and the structure of U. □ 

To sum it up, the correlation computation at the t-th iteration (i.e., computing ht-i) can be performed 

by | A* 1 1 subtractions of properly scaled and permutated versions of the correlation kernel vector c to 

the initial correlation vector ho. 

III. Fast OMP Recovery Algorithm 

In this section, we apply the fast correlation computation method to the conventional OMP. And we 
discuss the complexity of the proposed OMP algorithm applied by the proposed method. Applying the 
proposed correlation computation method to other MPAs is straightforward and entirely analogous with 
this section. 
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A. Fast Correlation Computation for the OMP 

The proposed correlation computation method Q for a general MPA can be easily converted for the 
OMP as 

$ H r , t = 1 



ht- 



t-i 



(10) 

And the proposed OMP recovery algorithm can be given by simply replacing the correlation computation 



h ~ Et=l X t -l(T)P\ T C, t > 1. 



step 2) in Algorithm 1.1 with (10 1. 



Note that the proposed OMP algorithm is actually identical to the conventional OMP algorithm. The 
only difference is the computation method and thus the proposed OMP guarantees the same recovery 
performance compared to the conventional OMP. 

B. Complexity Analysis 

In this subsection, we investigate the computational complexity of the proposed OMP algorithm in the 
cases of using the partial Fourier matrix and the partial Hadamard matrix. Firstly, for each matrix, we will 
discuss the properness of the proposed algorithm in terms of the storage requirements for the permutation 
matrices. Secondly, we compare the computational complexity of the proposed OMP algorithm to that 
of the conventional OMP algorithm. For the exact comparison, we consider the number of flops of each 
algorithm. And we regard one complex multiplication as 6 flops and one complex addition as 2 flops. 

We remark that the proposed OMP performs the (t — 1) subtractions of properly scaled and permutated 
versions of the correlation kernel vector at the i-th iteration to compute the correlation vector in the second 



equation in (10). We consider the case when N is a power of two, which is used very often in signal 
processing. But, the proposed method can be used for any N. 

1 ) Storage Requirements Using the Partial Fourier Matrix: When we use the partial Fourier matrix as 
the sensing matrix, from the discrete Fourier transform (DFT) properties, P\ t is the matrix which cycli- 
cally shifts c when P\ t is multiplied with the vector c from the left. Therefore, the storage requirements 
for the permutation matrices can be negligible. 

2) Computational Complexity Using the Partial Fourier Matrix: It is well known that the correlation 
computation at each iteration in the conventional OMP can be implemented by the iV-point FFT The 
iV-point FFT requires 5N log 2 N flops. 

For the proposed OMP, in ( [TO] ), the first iteration (t = 1) can be implemented by one FFT. And, when 
t > 1, the correlation vector is computed by using the correlation kernel vector. Exploiting the conjugate 
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TABLE I 

Comparison Between the Proposed OMP and the Conventional OMP (# of flops at the i-TH iteration) 



$ : Partial Fourier matrix 


t 


= 1 


> 1 


Conventional OMP 


5N\og 2 N 


5Nlog 2 N 


Proposed OMP 


5Nlog 2 N 


6N(t- 1) 




$ : Partial Hadamard matrix 


t 


= 1 


> 1 


Conventional OMP 


2A/log 2 7V 


2Nlog 2 N 


Proposed OMP 


2iVlog 2 iV 


2N(t — 1) 



symmetric property of the correlation kernel vector using the partial Fourier matrix as the sensing matrix, 



performing the second equation in ( lOl has the cost of (N/2){t— 1) complex multiplications, 2(N/2){t— 1) 
real additions, and N(t — 1) complex additions at the t-th iteration. Aggregately, the proposed OMP 
requires 6N(t — 1) flops at the t-th iteration. 

3) Storage Requirements Using the Partial Hadamard Matrix: The storage requirements of the pro- 
posed OMP algorithm with partial Hadamard matrix are also favorable. There is no need to store the 
entire N permutation matrices. If we store only the log 2 N permutation matrices, the permutation matrix 
P\ T for any A T can be easily performed by sequentially applying some matrices among the stored log 2 N 
permutation matrices. It is easily induced from the properties of the Hadamard matrix. 

4) Computational Complexity Using the Partial Hadamard Matrix: It is well known that the correlation 
computation at each iteration in the conventional OMP can be implemented by the N -point FHT. The 
TV-point FHT requires iVlog 2 iV complex additions (i.e., 2A r log 2 N flops). 

For the proposed OMP, in ( fT0| ), the first iteration (t = 1) can be implemented by one FHT. And, when 
t > 1, the correlation vector is computed by using the correlation kernel vector. Because the correlation 
kernel vector consists of only a small number of values compared to N using the partial Hadamard matrix 



as the sensing matrix, performing the second equation in (10 1 has approximately the cost of N(t — 1) 
complex additions. Table I summarizes this subsection. 

IV. Numerical Analysis 

Here we present some numerical results characterizing the performance of the proposed OMP algorithm 
compared to the conventional OMP algorithm. The results were produced using the partial Fourier matrices 
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and the partial Hadamard matrices with practical and various sizes. And we plot the computational 
complexities for 1 < t < 13, which is reasonable for given M and N. 




Fig. 1. Relative computational complexity of the proposed OMP compared to the conventional OMP at the t-th iteration when 
the partial Fourier matrices are used. 

Fig. [1] shows the relative computational complexity of the proposed OMP compared to the conventional 
OMP when the partial Fourier matrices are used. Because the computational complexity of the proposed 
OMP algorithm at the t-th iteration is proportional to t — 1, there is a excessive point and thus adaptive 
strategy is needed. For instance, for M = 64 and N = 4096, t = 11 is the excessive point and the 
conventional OMP can be used from the 12-th iteration. Consequently, the proposed OMP algorithm 
has a benefit to reduce the computational complexity substantially. Especially, for large N, the proposed 
OMP has a good benefit. 
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t 

Fig. 2. Relative computational complexity of the proposed OMP compared to the conventional OMP at the t-th iteration when 
the partial Hadamard matrices are used. 

Fig. [2] shows the relative computational complexity of the proposed OMP compared to the conventional 
OMP when the partial Hadamard matrices are used. Like the case of using the partial Fourier matrices in 
Fig. [T] the proposed OMP algorithm has a benefit to reduce the computational complexity substantially. 
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Fig. 3. Relative computational complexity of the proposed CoSaMP compared to the conventional CoSaMP at the t-th iteration 
when the partial Fourier matrices are used. 

Besides the proposed OMP, we present the numerical result when the proposed correlation computation 
method is applied to the CoSaMP [8]. Due to lack of space, we leave out the detailed description of 
the proposed CoSaMP Fig. [3] shows the relative computational complexity of the proposed CoSaMP 
compared to the conventional CoSaMP at the t-th iteration. We use the partial Fourier matrices as the 
sensing matrix. Different to the proposed OMP, the proposed CoSaMP requires the same computational 
cost at each iteration except when t = 1. Especially, the proposed CoSaMP algorithm has a good benefit 
for small K and large N. For instance, when M = 64, N = 8192, and K = 4, the proposed CoSaMP 
requires only the 37% computational cost compared to the conventional CoSaMP. 
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